[SPARK-48134][CORE] Spark core (java side): Migrate error/warn/info with variables to structured logging framework#46390
[SPARK-48134][CORE] Spark core (java side): Migrate error/warn/info with variables to structured logging framework#46390panbingkun wants to merge 7 commits intoapache:masterfrom
error/warn/info with variables to structured logging framework#46390Conversation
… with variables to structured logging framework
| this.slf4jLogger = slf4jLogger; | ||
| } | ||
|
|
||
| public boolean isErrorEnabled() { |
There was a problem hiding this comment.
It is found that some logic uses this method during migration.
| slf4jLogger.error(msg, arg1, arg2); | ||
| } | ||
|
|
||
| public void error(String msg, String... args) { |
There was a problem hiding this comment.
Why is the method signature not public void error(String msg, Object... args)
Because it conflicts with public void error(String msg, MDC... mdcs)
When we want to use error(String msg, MDC... mdcs), it will enter error(String msg, Object... args)
| */ | ||
| package org.apache.spark.io; | ||
|
|
||
| import com.google.common.base.Preconditions; |
There was a problem hiding this comment.
Taking this chance, let's adjust the import order of java code
| public class ReadAheadInputStream extends InputStream { | ||
|
|
||
| private static final Logger logger = LoggerFactory.getLogger(ReadAheadInputStream.class); | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(ReadAheadInputStream.class); |
There was a problem hiding this comment.
Let's capitalize variables of type static final.
There was a problem hiding this comment.
I would suggest avoiding such unnecessary change.
In https://www.slf4j.org/api/org/slf4j/Logger.html, the example code is using logger as well.
| requestingConsumer); | ||
| if (LOGGER.isDebugEnabled()) { | ||
| LOGGER.debug("Task {} acquired {} for {}", String.valueOf(taskAttemptId), | ||
| Utils.bytesToString(got), requestingConsumer.toString()); |
| underlyingInputStream.close(); | ||
| } catch (IOException e) { | ||
| logger.warn(e.getMessage(), e); | ||
| LOGGER.warn("{}", e, MDC.of(LogKeys.REASON$.MODULE$, e.getMessage())); |
| MDC.of(LogKeys.THREAD_ID$.MODULE$, Thread.currentThread().getId()), | ||
| MDC.of(LogKeys.MEMORY_SIZE$.MODULE$, Utils.bytesToString(getMemoryUsage())), | ||
| MDC.of(LogKeys.NUM_SPILL_WRITERS$.MODULE$, spillWriters.size()), | ||
| MDC.of(LogKeys.TIME_UNIT$.MODULE$, spillWriters.size() > 1 ? " times" : " time")); |
…uctured logging framework ### What changes were proposed in this pull request? Since we are targeting on migration INFO/WARN/ERROR level logs to structure logging, this PR removes the logDebug and logTrace methods from the JAVA structured logging framework. ### Why are the changes needed? In the log migration PR #46390, there are unnecessary changes such as updating ``` logger.debug("Task {} need to spill {} for {}", taskAttemptId, Utils.bytesToString(required - got), requestingConsumer); ``` to ``` LOGGER.debug("Task {} need to spill {} for {}", String.valueOf(taskAttemptId), Utils.bytesToString(required - got), requestingConsumer.toString()); ``` With this PR, we can avoid such changes during log migrations. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing UT. ### Was this patch authored or co-authored using generative AI tooling? No Closes #46405 from gengliangwang/updateJavaLog. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
|
@gengliangwang |
|
|
||
| package org.apache.spark.util.collection.unsafe.sort; | ||
|
|
||
| import java.io.*; |
There was a problem hiding this comment.
let's revert the unnecessary changes in this file
There was a problem hiding this comment.
Is it the import order of the file header, or from org.slf4j.Logger & org.slf4j.LoggerFactory to org.apache.spark.internal.Logger &org.apache.spark.internal.LoggerFactory
The later change is to prohibit the import of org.slf4j.Logger & org.slf4j.LoggerFactory in java code later, so as to prevent subsequent developers from introducingorg.slf4j.Logger & org.slf4j.LoggerFactory again, thus writing inconsistent logic.
There was a problem hiding this comment.
Since there is no log migration in this file, why do we need to change the import order here?
There was a problem hiding this comment.
The import order has been restored.
I actually have an idea to align the import order of java with the import order of scala.
It's OK, I restored it, 😄
| logger.info("Spilling data because number of spilledRecords crossed the threshold " + | ||
| numElementsForSpillThreshold); | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold {}", | ||
| MDC.of(LogKeys.SHUFFLE_SPILL_NUM_ELEMENTS_FORCE_SPILL_THRESHOLD$.MODULE$, |
There was a problem hiding this comment.
this key is sort of long. How about NUM_ELEMENTS_SPILL_THRESHOLD?
gengliangwang
left a comment
There was a problem hiding this comment.
LGTM except two comments
|
Thanks, merging to master |
| if (inMemSorter.numRecords() >= numElementsForSpillThreshold) { | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold " + | ||
| numElementsForSpillThreshold); | ||
| logger.info("Spilling data because number of spilledRecords crossed the threshold {}" + |
There was a problem hiding this comment.
When re-implementing SPARK-27734, I found that there was something wrong with the logs here.
{}" + MDC.of -> {}", MDC.of

What changes were proposed in this pull request?
The pr aims to
1.migrate
error/warn/infoin modulecorewith variables tostructured logging frameworkfor java side.2.convert all dependencies on
org.slf4j.Logger & org.slf4j.LoggerFactorytoorg.apache.spark.internal.Logger & org.apache.spark.internal.LoggerFactory, in order to completelyprohibitimportingorg.slf4j.Logger & org.slf4j.LoggerFactoryin java code later.Why are the changes needed?
To enhance Apache Spark's logging system by implementing structured logging.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.